Phone recognition analysis for trajectory HMM

Authors

  • Le Zhang
  • Steve Renals
Abstract

The trajectory HMM has been shown to be useful for model-based speech synthesis, where a smoothed trajectory is generated using temporal constraints imposed by dynamic features. To evaluate the performance of such a model on an ASR task, we present a trajectory decoder based on tree search with delayed path merging. An experiment on a speaker-dependent phone recognition task using the MOCHA-TIMIT database shows that the MLE-trained trajectory model, while retaining the attractive property of being a proper generative model, tends to favour over-smoothed trajectories among competing hypotheses, and does not perform better than a conventional HMM. We use this to argue that models giving a better fit to the training data may lose discriminative power by being too faithful to that data. This partially explains why alternative acoustic models that try to explicitly model temporal constraints do not achieve significant improvements in ASR.
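
To make the smoothing effect of the dynamic-feature constraint concrete, the following is a minimal sketch, not the decoder described in the abstract. It assumes a one-dimensional static feature, a single delta window with coefficients (-0.5, 0, 0.5) clamped at the utterance edges, diagonal variances, and a fixed, known state sequence; the function names `build_window_matrix` and `mean_trajectory` are hypothetical. Under those assumptions the mean trajectory is the weighted least-squares solution of W c = mu, which is the standard form used for trajectory HMMs.

```python
import numpy as np


def build_window_matrix(num_frames):
    """Stack static and delta windows so that o = W @ c.

    Rows are interleaved per frame: [static_0, delta_0, static_1, ...].
    The delta window (-0.5, 0, 0.5) is an assumed choice; out-of-range
    neighbours are clamped to the first or last frame.
    """
    W = np.zeros((2 * num_frames, num_frames))
    for t in range(num_frames):
        prev_t = max(t - 1, 0)
        next_t = min(t + 1, num_frames - 1)
        W[2 * t, t] = 1.0              # static feature row
        W[2 * t + 1, next_t] += 0.5    # delta feature row
        W[2 * t + 1, prev_t] -= 0.5
    return W


def mean_trajectory(mu, var):
    """Return the mean static trajectory for one fixed state sequence.

    mu, var: arrays of shape (num_frames, 2) holding the per-frame
    state mean and variance of [static, delta].
    Solves (W^T P W) c = W^T P mu, with P the diagonal precision.
    """
    num_frames = mu.shape[0]
    W = build_window_matrix(num_frames)
    m = mu.reshape(-1)                 # interleaved, matches W's rows
    p = 1.0 / var.reshape(-1)
    A = W.T @ (p[:, None] * W)
    b = W.T @ (p * m)
    return np.linalg.solve(A, b)


if __name__ == "__main__":
    # Two "states" with static means 0 and 1, and zero delta means.
    mu = np.zeros((6, 2))
    mu[3:, 0] = 1.0
    var = np.ones((6, 2))
    print(mean_trajectory(mu, var))
```

In this toy case the step between the two static means is replaced by a gradual transition, because the zero-mean delta terms penalise abrupt changes. This is the kind of smoothing the abstract refers to, shown here only for a hand-picked state sequence.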

Similar resources

Semi-parametric trajectory modelling using temporally varying feature mapping for speech recognition

Recently, the trajectory HMM has been shown to improve the performance of both speech recognition and speech synthesis. For efficiency, a state sequence is required to compute the likelihood for the trajectory HMM, which limits its use to N-best rescoring for speech recognition. Motivated by the success of models with temporally varying parameters, this paper proposes a Temporally Varying Feature Mapping (TV...

Convolutional neural network with adaptive windows for speech recognition

Although speech recognition systems are widely used and their accuracy is continuously improving, there is a considerable gap between their accuracy and human recognition ability. This is partially due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...

Modeling trajectories in the HMM framework

Most state-of-the-art statistical speech recognition systems use hidden Markov models (HMMs) for modeling the speech signal. However, limited by the assumption of conditional independence of observations given the state sequence, current HMMs model the trajectory constraints in speech poorly. In [1], we introduced the parallel path HMM, where each phonetic unit is represented by a parallel coll...

Introduce Segmental Inner Timewarping into Parametric Trajectory Segment Model for LVCSR

In this paper, a parametric trajectory segment model (PTSM) with segmental inner time warping is proposed to improve the recognition accuracy of large vocabulary continuous speech recognition (LVCSR). The proposed PTSM uses the state boundary information provided by an HMM system during decoding to perform segmental inner time warping. Good alignment between different length realizations of a same p...

Trajectory modeling based on HMMs with the explicit relationship between static and dynamic features

This paper shows that the HMM whose state output vector includes static and dynamic feature parameters can be reformulated as a trajectory model by imposing the explicit relationship between the static and dynamic features. The derived model, named trajectory HMM, can alleviate the limitations of HMMs: i) constant statistics within an HMM state and ii) independence assumption of state output pr...


Publication date: 2006